Segmentation of Touching, Overlapping, Skewed and Short Handwritten Text Lines
Abstract
Text line segmentation is an inherent part of a document recognition system and an important preprocessing step for word and character segmentation. Touching or overlapping text lines, short lines, curvilinear or skewed lines, and small or varying gaps between text lines make segmentation challenging, and these variations cause errors in the recognition phase. This paper describes a top-down approach to handwritten text line segmentation. The proposed method begins with core detection. To segment overlapping components, run-length information is used to obtain structural knowledge that classifies the components into upper and lower text lines. To segment short and skewed lines, distance metrics and connected components are used recursively. The system was evaluated on 200 images from the IAM database and 100 documents collected from different writers. In these experiments, the system achieved 91.92% accuracy, which supports its reliability.
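The abstract names run-lengths and distance metrics without giving details, so the following is only a minimal sketch of those two building blocks, not the paper's actual algorithm. It assumes a binarized image where 1 marks ink, and the function names `vertical_runs` and `nearest_core` are hypothetical. The first function extracts the vertical runs of ink in one pixel column. The second assigns a component to the closest text-line core by the vertical distance from its centroid.

```python
from typing import List, Tuple


def vertical_runs(column: List[int]) -> List[Tuple[int, int]]:
    """Return (start_row, length) for each run of ink pixels (1) in one column."""
    runs = []
    start = None  # row where the current run began, or None if not inside a run
    for row, pixel in enumerate(column):
        if pixel and start is None:
            start = row
        elif not pixel and start is not None:
            runs.append((start, row - start))
            start = None
    if start is not None:  # a run that reaches the bottom of the column
        runs.append((start, len(column) - start))
    return runs


def nearest_core(centroid_row: float, core_rows: List[float]) -> int:
    """Return the index of the text-line core closest to a component's centroid row.

    Assumption: each core is summarized by a single row coordinate.
    """
    return min(range(len(core_rows)), key=lambda i: abs(core_rows[i] - centroid_row))


if __name__ == "__main__":
    column = [0, 1, 1, 0, 0, 1, 1, 1, 0]
    print(vertical_runs(column))          # [(1, 2), (5, 3)]
    print(nearest_core(6.0, [1.5, 7.0]))  # 1
```

In a column that crosses two touching lines, an unusually long run is one possible cue that a stroke spans both lines. How the paper actually splits such a stroke between the upper and lower line is not stated in the abstract.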
Related resources
A new scheme for unconstrained handwritten text-line segmentation
Variations in inter-line gaps and skewed or curled text-lines are some of the challenging issues in segmentation of handwritten text-lines. Moreover, overlapping and touching text-lines that frequently appear in unconstrained handwritten text documents significantly increase segmentation complexities. In this paper, we propose a novel approach for unconstrained handwritten text-line segmentatio...
Performance of Statistics Based Line Segmentation System for Unconstrained Handwritten Text
Handwritten character recognition is a technique by which a computer system can recognize characters and other symbols written in natural handwriting. Segmentation decomposes the document image into subcomponents such as lines, words and characters. To achieve greater accuracy, segmentation and recognition cannot be treated independently. Most of the existing line segmentation methods have li...
A Survey on Word Segmentation Method for Handwritten Documents
One of the most important and challenging tasks in a handwritten recognition pipeline is the segmentation of handwritten document images into text lines and words. Several problems inherent in handwritten documents such as the difference in the skew angle between text lines or along the same text line, the existence of adjacent text lines or words touching, the existence of characters with diff...
Separation of touching and overlapping words in adjacent lines of handwritten text
This paper reports on a novel technique for the separation of characters and words that are connected through touching or overlapping of characters between adjacent lines of text. The technique employs structural knowledge of handwriting styles where overlap is most frequently observed. The method is shown to work well in the most usual cases and resolve many of the more difficult cases observe...